Optimal partitioning of data chunks in deduplication systems
Similar resources
Optimal Partitioning of Data Chunks in Deduplication Systems
Deduplication is a special case of data compression in which repeated chunks of data are stored only once. For very large chunks, this process may be applied even if the chunks are similar and not necessarily identical, and then the encoding of duplicate data consists of a sequence of pointers to matching parts. However, not all the pointers are worth being kept, as they incur some storage over...
Similarity Based Deduplication with Small Data Chunks
Large backup and restore systems may have a petabyte or more of data in their repository. Such systems are often compressed by means of deduplication techniques, which partition the input text into chunks and store recurring chunks only once. One of the approaches is to use hashing methods to store fingerprints for each data chunk, detecting identical chunks with very low probability for collisions...
File recipe compression in data deduplication systems
Data deduplication systems discover and exploit redundancies between different data blocks. The most common approach divides data into chunks and identifies redundancies via fingerprints. The file content can be rebuilt by combining the chunk fingerprints which are stored sequentially in a file recipe. The corresponding file recipe data can occupy a significant fraction of the total disk space,...
Optimal Partitioning for Spatial Data
It is desirable to design partitioning techniques that minimize the I/O time incurred during query execution in spatial databases. In this paper, we explore optimal partitioning techniques for spatial data for different types of queries. In particular, we show that hexagonal partitioning has optimal I/O cost for circular queries compared to all possible non-overlapping partitioning techniques th...
Optimal Data Partitioning of MPEG-2 Coded
We analyze the problem of optimal data partitioning of MPEG-2 coded video in an operational rate-distortion context. The optimal algorithm is characterized and shown to have high complexity and delay. A causally optimal algorithm based on Lagrangian optimization is proposed, that optimally solves the problem for intra (I) pictures, while it provides an optimal solution for predicted/interpolate...
Journal
Journal title: Discrete Applied Mathematics
Year: 2016
ISSN: 0166-218X
DOI: 10.1016/j.dam.2015.12.018